The Persian alphabet, also known as the Perso-Arabic script, is the right-to-left alphabet used for the Persian language. It is a variation of the Arabic script with four additional letters: پ چ ژ گ (the sounds 'p', 'ch', 'zh', and 'g', respectively), in addition to the obsolete ڤ, which was used for the sound /β/. This letter is no longer used in Persian, as the /β/ sound changed to /b/, e.g. archaic زڤان > زبان 'language'.
It was the basis of many Arabic-based scripts used in Central and South Asia. It is used for both Iranian Persian and Dari, two standard varieties of Persian, and is one of two official scripts for the Persian language, alongside the Cyrillic-based Tajik alphabet.
The script is mostly but not exclusively right-to-left; mathematical expressions, numeric dates and numbers bearing units are embedded from left to right. The script is cursive, meaning most letters in a word connect to each other; when they are typed, contemporary word processors automatically join adjacent letter forms. Persian is unusual among languages written in the Arabic script in that a zero-width non-joiner is sometimes entered within a word, causing a letter to become disconnected from the others in the same word.
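As a small illustration of the zero-width non-joiner, the following Python sketch places U+200C between a prefix and a stem so that the two parts stay visually disconnected. The helper name `join_with_zwnj` and the sample word می‌خواهم ("I want") are illustrative choices and do not come from the text above.

```python
import unicodedata

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER

def join_with_zwnj(prefix, stem):
    """Concatenate two parts of a word while preventing cursive joining."""
    return prefix + ZWNJ + stem

# می + خواهم, written with escapes so the code points are unambiguous.
word = join_with_zwnj("\u0645\u06cc", "\u062e\u0648\u0627\u0647\u0645")

print(word)
print([f"U+{ord(ch):04X}" for ch in word])
print(unicodedata.name(ZWNJ))
```

The non-joiner is an ordinary character in the string, so it counts toward the length of the word and must be handled when comparing or searching text.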
Under the influence of various Persian empires, many languages in Central and South Asia that adopted the Arabic script use the Persian alphabet as the basis of their writing systems. Today, extended versions of the Persian alphabet are used to write a wide variety of Indo-Iranian languages, including Kurdish, Balochi, Pashto, Urdu (from Classical Hindustani), Saraiki, Shahmukhi, Sindhi and Kashmiri. In the past, the Persian alphabet was in common use among Turkic languages, but today its use is limited to those spoken within Iran, such as Azerbaijani, Turkmen, Qashqai, Chaharmahali and Khalaj. The Uyghur language in western China is the most notable exception to this.
During the Soviet period, the writing systems of many languages in Central Asia, including Persian, were reformed by the government. This ultimately resulted in the Cyrillic-based alphabet used in Tajikistan today.
The names of the letters are mostly the ones used in Arabic, but with Persian pronunciation. The only ambiguous name is he, which is used for both ح and ه. For clarification, the two are often distinguished as the "jim-like he" (after jim, the name of the letter ج, which uses the same base form) and the "two-eyed he" (after the contextual medial letterform ـهـ), respectively. Nine Persian letters are mainly used in Arabic or foreign loanwords and not in native words: ث, ح, ذ, ص, ض, ط, ظ, ع and غ. These letters are also commonly used in proper names. Unlike Arabic, Persian has no pharyngealization. Although the letter غ is mainly used in Arabic loanwords, some native Persian words contain it, such as آغاز and زغال. The pronunciation of these letters in Persian can differ from their pronunciation in Arabic. For example, the letter ث is pronounced /s/ in Persian but /θ/ in Arabic.
Letter | Persian | Arabic |
ث | /s/ | /θ/ |
ح | /h/ | /ħ/ |
ذ | /z/ | /ð/ |
ص | /s/ | /sˤ/ |
ض | /z/ | /dˤ/ |
ط | /t/ | /tˤ/ |
ظ | /z/ | /ðˤ/ |
ع | Glottal stop | /ʕ/ |
غ | /ɢ/ or /ɣ/ | /ɣ/ |
# | Name | Unicode | Final | Medial | Initial | Isolated |
0 | همزه (glottal stop) | U+0621 | — | — | — | ء |
0 | همزه | U+0623 | ـأ | — | — | أ |
0 | همزه | U+0626 | ـئ | ـئـ | ئـ | ئ |
0 | همزه | U+0624 | ـؤ | — | — | ؤ |
1 | الف | U+0627 | ـا | — | — | ا |
2 | ب | U+0628 | ـب | ـبـ | بـ | ب |
3 | پ | U+067E | ـپ | ـپـ | پـ | پ |
4 | ت | U+062A | ـت | ـتـ | تـ | ت |
5 | ث | U+062B | ـث | ـثـ | ثـ | ث |
6 | جیم | U+062C | ـج | ـجـ | جـ | ج |
7 | چ | U+0686 | ـچ | ـچـ | چـ | چ |
8 | ح | U+062D | ـح | ـحـ | حـ | ح |
9 | خ | U+062E | ـخ | ـخـ | خـ | خ |
10 | دال | U+062F | ـد | — | — | د |
11 | ذال | U+0630 | ـذ | — | — | ذ |
12 | ر | U+0631 | ـر | — | — | ر |
13 | ز | U+0632 | ـز | — | — | ز |
14 | ژ | U+0698 | ـژ | — | — | ژ |
15 | سین | U+0633 | ـس | ـسـ | سـ | س |
16 | شین | U+0634 | ـش | ـشـ | شـ | ش |
17 | صاد | U+0635 | ـص | ـصـ | صـ | ص |
18 | ضاد | U+0636 | ـض | ـضـ | ضـ | ض |
19 | طا | U+0637 | ـط | ـطـ | طـ | ط |
20 | ظا | U+0638 | ـظ | ـظـ | ظـ | ظ |
21 | عین | U+0639 | ـع | ـعـ | عـ | ع |
22 | غین | U+063A | ـغ | ـغـ | غـ | غ |
23 | ف | U+0641 | ـف | ـفـ | فـ | ف |
24 | قاف | U+0642 | ـق | ـقـ | قـ | ق |
25 | کاف | U+06A9 | ـک | ـکـ | کـ | ک |
26 | گاف | U+06AF | ـگ | ـگـ | گـ | گ |
27 | لام | U+0644 | ـل | ـلـ | لـ | ل |
28 | میم | U+0645 | ـم | ـمـ | مـ | م |
29 | نون | U+0646 | ـن | ـنـ | نـ | ن |
30 | واو | U+0648 | ـو | — | — | و |
31 | ه | U+0647 | ـه | ـهـ | هـ | ه |
32 | ی | U+06CC | ـی | ـیـ | یـ | ی |
Historically, in Early New Persian, there was a special letter for the sound /β/. This letter is no longer used, as the /β/ sound changed to /b/, e.g. archaic زڤان /zaβān/ > زبان 'language'.
ڤ | ve | ڤ | ـڤ | ـڤـ | ڤـ |
ݣ | ـݣ | ـݣـ | ڭـ |
ك | ـك | ـكـ | كـ |
ݿ | ـݿ | ـݿـ | ݿـ |
ی ه و ن م ل گ ک ق ف غ ع ظ ط ض ص ش س ژ ز ر ذ د خ ح چ ج ث ت پ ب ا ء
The alphabet in 16 fonts: Noto Nastaliq Urdu, Scheherazade, Lateef, Noto Naskh Arabic, Markazi Text, Noto Sans Arabic, Baloo Bhaijaan, El Messiri SemiBold, Lemonada Medium, Changa Medium, Mada, Noto Kufi Arabic, Reem Kufi, Lalezar, Jomhuria, and Rakkas.
1 dot below | ﮳ | ب | ج |
1 dot above | ﮲ | ن | خ | ض | ظ | غ | ف | ذ | ز |
2 dots below | ﮵ | ی |
2 dots above | ﮴ | ت | ق | ة |
3 dots below | ﮹ | پ | چ |
3 dots above | ﮶ | ث | ش | ژ |
Line above | ‾ | گ |
None | — | ء | ا | ی | ں | ح | س | ص | ط | ع | ک | ل | م | د | ر | و | ه |
Madda above | ۤ | آ |
Hamza below | ــٕـ | إ |
Hamza above | ــٔـ | أ | ئ | ؤ | ۀ |
The i'jam diacritic characters are illustrative only; in most typesetting the combined characters in the middle of the table are used.
Persian yē has 2 dots below in the initial and middle positions only. The standard Arabic version ي يـ ـيـ ـي always has 2 dots below.
U+064E | زبر (فتحه) |
U+0650 | زیر (کسره) |
U+064F | پیش (ضمّه) |
There is no standard transliteration for Persian. The letters 'i' and 'u' are only ever used as short vowels when transliterating Dari or Tajik Persian (see Persian phonology).
Diacritics differ by dialect, because Dari has 8 distinct vowels compared to the 6 vowels of Farsi (see Persian phonology).
In Farsi, none of these short vowels may be the initial or final grapheme in an isolated word, although they may appear in the final position as an inflection when the word is part of a noun group. In a word that starts with a vowel, the first grapheme is a silent alef which carries the short vowel, e.g. اُمید (meaning "hope"). In a word that ends with a vowel, a proxy letter stands in for the short vowel, as و does in نو (meaning "new") and ه does in بسته (meaning "package").
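To make the role of the initial carrier letter concrete, the sketch below lists the code points of اُمید and then removes the vowel mark. It assumes the word is encoded as U+0627 U+064F U+0645 U+06CC U+062F; the helper names `describe` and `strip_marks` are hypothetical and only serve the illustration.

```python
import unicodedata

# اُمید ("hope"): alef carrying a short-vowel mark, then meem, yeh, dal.
WORD = "\u0627\u064f\u0645\u06cc\u062f"

def describe(text):
    """Return (code point, Unicode name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

def strip_marks(text):
    """Drop nonspacing marks (category Mn), which include the vowel diacritics."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")

for code, name in describe(WORD):
    print(code, name)

print(strip_marks(WORD))
```

The vowel mark is stored as a separate combining character after the alef, which is why everyday text that omits diacritics is simply the same string without that code point.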
U+064B | تنوین نَصْبْ |
U+064D | تنوین جَرّ | Never used in the Persian language. Taught in Islamic nations to complement Quran education. |
U+064C | تنوین رَفْعْ |
U+0651 | تشدید |
U+0622 | ـآ | — | آ | The final form is very rare and is freely replaced with ordinary alef. |
U+06C0 | ـۀ | — | ۀ | Validity of this form depends on region and dialect. Some may use the two-letter ـهی or هی combinations instead. |
U+0644 (lām) and U+0627 (alef) | ـلا | لا |
U+0640 | — | ـ | — | This is the medial character which connects other characters. |
Although the Persian and Arabic alphabets may seem similar at first glance, the two languages use them in many different ways. For example, similar words are often written differently in Persian and Arabic.
Unicode has accepted U+262B ☫ FARSI SYMBOL in the Miscellaneous Symbols range ("Miscellaneous Symbols", p. 4, The Unicode Standard, Version 13.0, Unicode.org). In Unicode 1.0 this symbol was known as SYMBOL OF IRAN ("3.8 Block-by-block Charts", § Miscellaneous Dingbats, p. 325 (155 electronically), The Unicode Standard, Version 1.0, Unicode.org). It is a stylization of الله (Allah) used as the emblem of Iran. It is also a part of the flag of Iran.
The Unicode Standard defines a compatibility character, U+FDFC ﷼ RIAL SIGN, that can represent ریال, the Persian name of the Iranian rial. The original proposal put the character forward under a different name, which the standard committees changed to RIAL SIGN.
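The two symbols discussed here can be inspected with Python's standard `unicodedata` module. This is a minimal sketch that assumes the emblem is U+262B and the rial compatibility character is U+FDFC; the constant names are illustrative.

```python
import unicodedata

FARSI_SYMBOL = "\u262b"  # emblem of Iran, Miscellaneous Symbols block
RIAL_SIGN = "\ufdfc"     # compatibility character for the word ریال

print(unicodedata.name(FARSI_SYMBOL))
print(unicodedata.name(RIAL_SIGN))

# Compatibility normalization (NFKC) expands the single sign into the
# four ordinary letters that spell the word.
expanded = unicodedata.normalize("NFKC", RIAL_SIGN)
print([f"U+{ord(ch):04X}" for ch in expanded])
```

Because the rial sign is a compatibility character, text that has been NFKC-normalized will contain the four-letter spelling instead of the single code point.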
Sound | Letter | Unicode |
p | پ | U+067E |
ch | چ | U+0686 |
zh | ژ | U+0698 |
g | گ | U+06AF |
Digit | Name | Unicode |
۰ | صفر sefr | U+06F0 |
۱ | یک yek | U+06F1 |
۲ | دو do | U+06F2 |
۳ | سه se | U+06F3 |
۴ | چهار čahâr | U+06F4 |
۵ | پنج panj | U+06F5 |
۶ | شش šeš | U+06F6 |
۷ | هفت haft | U+06F7 |
۸ | هشت hašt | U+06F8 |
۹ | نه noh | U+06F9 |
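The digit table can be turned into a simple conversion routine. The sketch below assumes the Persian digit forms occupy U+06F0 to U+06F9 (the Extended Arabic-Indic digits); the function names are hypothetical.

```python
# Build the ten Persian digits from their code points, U+06F0..U+06F9.
PERSIAN_DIGITS = "".join(chr(0x06F0 + i) for i in range(10))
ASCII_DIGITS = "0123456789"

TO_PERSIAN = str.maketrans(ASCII_DIGITS, PERSIAN_DIGITS)
TO_ASCII = str.maketrans(PERSIAN_DIGITS, ASCII_DIGITS)

def to_persian_digits(text):
    """Replace ASCII digits with Persian digits, leaving other text alone."""
    return text.translate(TO_PERSIAN)

def to_ascii_digits(text):
    """Replace Persian digits with ASCII digits, leaving other text alone."""
    return text.translate(TO_ASCII)

print(to_persian_digits("1402"))
print(to_ascii_digits(to_persian_digits("1402")))

# Python's int() accepts any Unicode decimal digits, including these.
print(int("\u06f4\u06f9"))
```

A translation table keeps the conversion to a single pass over the string and leaves punctuation and letters untouched.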
Letter | Persian | Arabic |
ی | U+06CC | U+064A |
ک | U+06A9 | U+0643 |
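Because yeh and kaf have separate Persian and Arabic code points, text typed on an Arabic keyboard layout may not match Persian text that looks the same. A minimal normalization sketch follows, assuming the Persian forms are U+06CC and U+06A9 and the Arabic forms are U+064A and U+0643; the function name `normalize_persian` is hypothetical.

```python
# Map the Arabic code points for yeh and kaf to their Persian counterparts.
ARABIC_TO_PERSIAN = str.maketrans({
    "\u064a": "\u06cc",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06a9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
})

def normalize_persian(text):
    """Return text with Arabic yeh and kaf replaced by the Persian forms."""
    return text.translate(ARABIC_TO_PERSIAN)

arabic_typed = "\u0643\u062a\u0627\u0628\u064a"   # typed with Arabic kaf and yeh
persian_typed = "\u06a9\u062a\u0627\u0628\u06cc"  # same word with Persian forms

print(arabic_typed == persian_typed)
print(normalize_persian(arabic_typed) == persian_typed)
```

Applying the same normalization to both the stored text and the search query is one way to make visually identical strings compare equal.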
Arabic numerals | 4 | 9 | 10 |
Eastern Arabic | ٤ | ٩ | ١٠ |
Persian | ۴ | ۹ | ۱۰ |
Urdu alphabet | |||
Abjad numerals | ه | ي |
In Tajikistan, the Persian alphabet was introduced into education and public life, although the banning of the Islamic Renaissance Party in 1993 slowed adoption. In 1999, the word Farsi was removed from the state-language law, reverting the name to simply Tajik. The de facto standard in use is the Tajik Cyrillic alphabet, and only a very small part of the population can read the Persian alphabet.